Software Clustering based on Information Loss Minimization

Authors

Periklis Andritsos

Vassilios Tzerpos

Abstract

The majority of the algorithms in the software clustering literature utilize structural information in order to decompose large software systems. Other approaches, such as using file names or ownership information, have also demonstrated merit. However, there is no intuitive way to combine information obtained from these two different types of techniques. In this paper, we present an approach that combines structural and non-structural information in an integrated fashion. LIMBO is a scalable hierarchical clustering algorithm based on the minimization of information loss when clustering a software system. We apply LIMBO to two large software systems in a number of experiments. The results indicate that this approach produces valid and useful clusterings of large software systems. LIMBO can also be used to evaluate the usefulness of various types of non-structural information to the software clustering process.
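The abstract does not state how information loss is measured. As a rough illustration only, the sketch below assumes the merge cost commonly used in information-bottleneck-style agglomerative clustering: the combined weight of two clusters multiplied by the weighted Jensen-Shannon divergence between their feature distributions. The file names, the feature layout (two structural "uses" features and two ownership features in one vector) and the functions `kl`, `merge_loss` and `agglomerate` are all hypothetical. This is a plain greedy procedure, not the LIMBO implementation, and it makes no attempt at scalability.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def merge_loss(w1, p1, w2, p2):
    """Assumed merge cost: (w1 + w2) * weighted Jensen-Shannon divergence."""
    w = w1 + w2
    a, b = w1 / w, w2 / w
    merged = [a * x + b * y for x, y in zip(p1, p2)]
    return w * (a * kl(p1, merged) + b * kl(p2, merged))

def agglomerate(items, k):
    """Greedily merge the pair with the smallest loss until k clusters remain.

    items: list of (name, weight, feature_distribution) tuples.
    """
    clusters = [([name], w, list(p)) for name, w, p in items]
    while len(clusters) > k:
        _, i, j = min(
            (merge_loss(clusters[i][1], clusters[i][2],
                        clusters[j][1], clusters[j][2]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        (m1, w1, p1), (m2, w2, p2) = clusters[i], clusters[j]
        w = w1 + w2
        merged = [(w1 * x + w2 * y) / w for x, y in zip(p1, p2)]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append((m1 + m2, w, merged))
    return [sorted(members) for members, _, _ in clusters]

# Hypothetical toy system. Feature order: [uses_io, uses_net, dev_alice, dev_bob],
# so structural and non-structural features share one distribution per file.
files = [
    ("a.c", 0.25, [0.5, 0.0, 0.5, 0.0]),
    ("b.c", 0.25, [0.5, 0.0, 0.5, 0.0]),
    ("c.c", 0.25, [0.0, 0.5, 0.0, 0.5]),
    ("d.c", 0.25, [0.0, 0.5, 0.0, 0.5]),
]
result = agglomerate(files, 2)
```

Under these assumptions, merging two files with identical feature distributions costs nothing, while merging files with disjoint features costs the most, so the toy run groups `a.c` with `b.c` and `c.c` with `d.c`.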


Similar resources

Clustering Categorical Data based on Information Loss Minimization

As the size of databases continues to grow, understanding their structure gets more difficult. This, together with the lack of documentation and the unavailability of the original designers of the database adds further difficulty to the job of researchers and professionals to understand the structure of large and complex databases. At the same time, data sources are distributed over several sit...


Optimal Capacitor Allocation in Radial Distribution Networks for Annual Costs Minimization Using Hybrid PSO and Sequential Power Loss Index Based Method

In the most recent heuristic methods, the high potential buses for capacitor placement are initially identified and ranked using loss sensitivity factors (LSFs) or power loss index (PLI). These factors or indices help to reduce the search space of the optimization procedure, but they may not always indicate the appropriate placement of capacitors. This paper proposes an efficient approach for t...


Regularized Co-Clustering on Manifold

Co-clustering is to partition rows and columns of a matrix simultaneously. It has been an important research field in data mining and machine learning. It is preferred over traditional homogeneous clustering techniques in many real applications. In this paper, we present a co-clustering algorithm based on local information and regularization. The algorithm seeks to preserve the local intrinsic ...


Experimental Evaluation of Algorithmic Effort Estimation Models using Projects Clustering

One of the most important aspects of software project management is the estimation of the cost and time required for running an information system. Therefore, software managers try to carry out estimation based on behavior, properties, and project restrictions. Software cost estimation refers to the process of predicting the development requirements of a software system. Various kinds of effort estimation patter...


Bilateral Weighted Fuzzy C-Means Clustering

Nowadays, the Fuzzy C-Means method has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may be significantly degraded in the presence of noise. This paper presents a robust clustering algorithm called Bilateral Weighted Fuzzy C-Means (BWFCM). We used a new objective function that uses some k...



Publication date: 2003
